
    Deep latent-variable models for neural text generation

    Text generation aims to produce human-like natural language output for downstream tasks. It covers a wide range of applications such as machine translation, document summarization and dialogue generation. Deep neural network-based end-to-end architectures are known to be data-hungry, and text generated from them usually suffers from low diversity, interpretability and controllability. As a result, it is difficult to trust their output in real-life applications. Deep latent-variable models, by specifying the probabilistic distribution over an intermediate latent process, provide a potential way of addressing these problems while maintaining the expressive power of deep neural networks. This presentation will explain how deep latent-variable models can improve over the standard encoder-decoder model for text generation. We will start with an introduction to encoder-decoder and deep latent-variable models, then go over popular optimization strategies, and finally elaborate on how latent-variable models can help improve diversity, interpretability and data efficiency in different text generation tasks.

    [Translated from the German abstract] Text generation aims to produce human-like natural language output for applications. It covers a wide range of applications such as machine translation, document summarization and dialogue generation. Recently, end-to-end architectures based on deep neural networks have mainly been used for this purpose. The end-to-end approach merges all submodules, which were previously designed according to complex handcrafted rules, into one holistic encoder-decoder architecture. Given sufficient training data, state-of-the-art performance can be achieved without language- or domain-dependent knowledge.

    Deep learning models, however, are known to be extremely data-hungry, and text generated from them usually suffers from low diversity, interpretability and controllability. As a result, it is difficult to trust their output in real-world applications. Deep latent-variable models, by specifying the probability distribution over an intermediate latent process, offer a potential way to solve these problems while retaining the expressive power of deep neural networks. This dissertation shows how deep latent-variable models improve text generation over the standard encoder-decoder model. We begin with an introduction to encoder-decoder and deep latent-variable models and then cover common optimization strategies such as variational inference, dynamic programming, soft relaxation and reinforcement learning. We then present the following:

    1. How latent variables can improve the diversity of text generation by learning holistic, sentence-level latent representations. In this way, a latent representation can first be selected, from which different texts can then be generated. We present effective algorithms for jointly training representation learning and text generation through variational inference. To address the limitations of variational inference with respect to uni-modality and inconsistency, we propose a wake-sleep variation and a training objective based on mutual information. Experiments show that they outperform both standard variational inference and non-latent-variable models in dialogue generation.
    2. How latent variables can improve the controllability and interpretability of text generation by adding finer-grained latent specifications to the intermediate generation process. We illustrate the use of latent variables for word alignment, content selection, text segmentation and field-segment correspondence. We derive efficient training algorithms for them, so that text generation can be controlled explicitly by manipulating the latent variable, which is human-interpretable by definition.
    3. How to overcome the scarcity of training samples by treating non-parallel text as latent variables. Training can be carried out as in the standard EM algorithm, which converges stably. We show that this can be applied successfully to dialogue generation and considerably enriches the generation space by using non-conversational text.
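As a reading aid for the variational inference strategy mentioned in this abstract, the bound usually optimized for a conditional latent-variable generator can be written as follows. This is the standard evidence lower bound, not a formula taken from the thesis itself; the symbols $x$ (input), $y$ (generated text), $z$ (latent variable), $\theta$ (generator parameters) and $\phi$ (inference-network parameters) are notation assumed here for illustration.

```latex
\log p_\theta(y \mid x)
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x, y)}\!\left[ \log p_\theta(y \mid x, z) \right]
  \;-\;
  \mathrm{KL}\!\left( q_\phi(z \mid x, y) \,\|\, p_\theta(z \mid x) \right)
```

The first term rewards reconstructing the text from the latent variable, while the KL term keeps the approximate posterior close to the prior so that sampling $z$ from the prior at generation time remains meaningful.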

    Improving Variational Encoder-Decoders in Dialogue Generation

    Variational encoder-decoders (VEDs) have shown promising results in dialogue generation. However, the latent variable distributions are usually approximated by a much simpler model than the powerful RNN structure used for encoding and decoding, yielding the KL-vanishing problem and an inconsistent training objective. In this paper, we separate the training step into two phases: the first phase learns to autoencode discrete texts into continuous embeddings, from which the second phase learns to generalize latent representations by reconstructing the encoded embedding. In this case, latent variables are sampled by transforming Gaussian noise through multi-layer perceptrons and are trained with a separate VED model, which has the potential to realize a much more flexible distribution. We compare our model with currently popular models, and the experiment demonstrates substantial improvement in both metric-based and human evaluations. Comment: Accepted by AAAI201
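To make the KL-vanishing problem concrete, here is a minimal sketch of the KL term that appears in a typical variational objective. It assumes a diagonal Gaussian approximate posterior and a standard normal prior, which is a common choice but is not confirmed by the abstract; the function name and the example values are hypothetical.

```python
import math

def kl_diag_gaussian_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) in nats.

    Assumption: diagonal Gaussian posterior and standard normal prior.
    """
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0
        for m, lv in zip(mu, log_var)
    )

# A posterior identical to the prior carries no information about the
# input: its KL term is zero. This is the "vanished" state.
collapsed = kl_diag_gaussian_to_standard_normal([0.0, 0.0], [0.0, 0.0])

# A posterior that moves away from the prior pays a positive KL cost.
informative = kl_diag_gaussian_to_standard_normal([1.0, -2.0], [-1.0, 0.5])

print(collapsed, informative)
```

When a strong decoder can model the text without the latent variable, driving this term to zero is the cheapest way to improve the objective, which is why the latent code ends up ignored.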

    NEXUS Network: Connecting the Preceding and the Following in Dialogue Generation

    Sequence-to-Sequence (seq2seq) models have become overwhelmingly popular in building end-to-end trainable dialogue systems. Though highly efficient in learning the backbone of human-computer communications, they suffer from the problem of strongly favoring short generic responses. In this paper, we argue that a good response should smoothly connect both the preceding dialogue history and the following conversations. We strengthen this connection through mutual information maximization. To sidestep the non-differentiability of discrete natural language tokens, we introduce an auxiliary continuous code space and map this code space to a learnable prior distribution for generation purposes. Experiments on two dialogue datasets validate the effectiveness of our model, where the generated responses are closely related to the dialogue context and lead to more interactive conversations. Comment: Accepted by EMNLP201
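The quantity being maximized can be illustrated on a toy discrete example. The sketch below computes mutual information exactly from a small joint probability table; it is not the estimator used in the paper, which works with a continuous code space, and the function name and tables are hypothetical.

```python
import math

def mutual_information(joint):
    """I(X;Y) in nats for a joint probability table joint[x][y]."""
    px = [sum(row) for row in joint]          # marginal over rows
    py = [sum(col) for col in zip(*joint)]    # marginal over columns
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:  # zero-probability cells contribute nothing
                mi += p * math.log(p / (px[i] * py[j]))
    return mi

# Response independent of context: a generic reply fits everything equally.
independent = mutual_information([[0.25, 0.25], [0.25, 0.25]])

# Response fully determined by context: maximal dependence for two outcomes.
dependent = mutual_information([[0.5, 0.0], [0.0, 0.5]])

print(independent, dependent)
```

A generic response such as "I don't know" is equally likely after any context, so it shares little information with it; an objective that rewards high mutual information pushes the model away from such replies.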

    Energy Savings Potential of Phase Change Material Integrated Building Envelope in South Texas

    The building sector accounts for about 41% of primary energy consumption, 71% of electricity usage and 38% of CO2 emissions in the U.S. Therefore, it is critical to develop efficient solutions to reduce its energy consumption and environmental impact. The building envelope plays an important role in building energy use: it separates the indoor from the outdoor environment, regulates the heat flux entering the building and ensures comfort for the occupants. A carefully designed high-performance building envelope can substantially reduce total energy consumption. Among the several effective envelope techniques, the incorporation of phase change materials (PCMs) has received considerable attention during the past decades. PCMs are substances that change phase within a certain temperature range and can store and release large amounts of energy. As an integrated component of the envelope, they absorb heat when the outdoor temperature rises and solar radiation strikes the building, until the phase change is complete. When the surrounding temperature falls below their phase-change temperature, heat flows out of the PCMs and the reverse phase change occurs. Previous research has shown that PCM-integrated building envelopes can mitigate indoor air temperature swings, decrease cooling loads and greatly reduce or shift building peak loads. Most PCMs evaluated for envelope integration change phase from solid to liquid at a temperature around the acceptable comfort range of 21 °C to 27 °C. For locations with a moderate climate, a reduction in annual residential cooling energy consumption of about 10% can be expected. Nevertheless, the effectiveness of PCMs in the envelope is strongly influenced by several factors, including climate, location of the PCM in the wall, amount of PCM, PCM melting and freezing temperatures, and wall orientation. Texas is a highly populated and energy-intensive state with long, hot summers. Residential buildings in the state typically have light construction, which is favorable for PCM integration in the envelope, and low energy efficiency. Therefore, a comprehensive study of PCM-integrated residential building envelopes is needed for this location. This paper investigates the impact of PCM-integrated building envelopes on energy consumption and thermal comfort of residential buildings in south Texas. A prototype single-family building is used as a case study. Annual simulations are performed in EnergyPlus to obtain yearly total energy consumption and yearly energy consumption for space heating and cooling. Results for cases with and without a PCM-integrated envelope are compared to analyze the feasibility and energy savings performance of this technique, taking the type of PCM, PCM location in the wall, amount of PCM and application orientation into consideration. Additionally, the improvement in indoor thermal comfort is presented by evaluating the mean radiant temperature.
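The storage mechanism described in this abstract can be summarized with the textbook energy balance for a material heated through its melting point. This is a generic relation, not one quoted from the paper, and all symbols are assumed here for illustration: $m$ is the PCM mass, $c_s$ and $c_l$ the solid and liquid specific heats, $L_f$ the latent heat of fusion, $T_m$ the melting temperature, and $T_1 < T_m < T_2$ the start and end temperatures.

```latex
Q \;=\; m \left[ c_s \,(T_m - T_1) \;+\; L_f \;+\; c_l \,(T_2 - T_m) \right]
```

The latent term $m L_f$ is absorbed at a nearly constant temperature, which is why a PCM layer can take up heat during the day without a matching rise in wall temperature and release it again once the surroundings cool below $T_m$.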

    Cash Demand and Financing Decisions

    Recent literature has begun to focus on how the urgency of cash demand affects the choice of financing sources. Extant studies use data from the U.S. and conclude that firms use debt financing to meet immediate cash demand and equity financing to meet longer-term cash demand. Using data from China, this paper finds the opposite: firms are more likely to use equity financing to meet immediate cash demand and debt financing to meet longer-term cash demand. We discuss the possible mechanisms behind this pattern.